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l. INTRODUCTION 


In the numerical analysis of many physical 
problems, oftentimes the formulation will lead to 
simultaneous, linear, algebraic equations. Field 
problems governed by Laplace, Poisson, and _ bi- 
harmonic equations are common examples. Nu- 
merical solutions of the diffusion equation using 
implicit representation offers another instance. In 
general, they find their application to the solution 
of various types of differential equations, both ordi- 
nary and partial. Due to the advent of modern 
high speed computers, seeking the solution of these 
equations becomes, in many cases, a daily routine 
even if the number of unknowns gets large. 

The usual method for the solution of such linear 
equations is the Gauss elimination procedure. How- 
ever, difficulties arise when the true solution is 
sensitive to round-off errors injected during compu- 
tation. The latter may accumulate to such an 
extent that the numerical answers become com- 
pletely worthless. Besides, the solution may also 
be sensitive to small errors in the coefficients and 


i 


quantities at the right hand side. In either case, 
the determinant of the coefficient matrix is nearly 
singular. Such equations are described as_ ill- 
conditioned. 

The occurrence of ill-conditioned equations in 
physical problems are by no means random. As 
pointed out by Turing,“* there is a large class of 
problems which naturally give rise to highly 1ill- 
conditioned systems. Head and Oulton) indicated 
their appearance in problems associated with air- 
craft design. In the evaluation of local tempera- 
tures at the sliding interface between a cutting tool 
and flowing metal chip, ill-conditioned systems 
were likewise encountered.‘*) While the subject has 
received the attention of many people in recent 
years and some means of measuring “ill-condi- 
tionedness” have developed, one immediately dis- 
covers the lack of adequate information when 
attempting to solve such equations with high speed 
computers. 


* Numbers in superscript refer to References Cited. 


ll. LITERATURE SURVEY 


Attempts which have been made at solving ill- 
conditioned, simultaneous, linear, algebraic equa- 
tions fall into two main categories. First, there 1s 
the indirect method, in which the unknowns are 
obtained by successive approximations. The second 
is the direct approach, in which the solution is 
obtained by a single application of the process. 

An example of the indirect method is the widely 
publicized relaxation procedure of Southwell. 
Shaw, Buckingham,” and Fox gave detailed 
accounts of the procedure when applied to ill- 
conditioned systems. They all concluded that the 
convergence might become too slow to be of any 
practical value. In fact, to ensure convergence of 
the process, the coefficient matrix must be positive 
definite and symmetric.“ While any matrix can 
be converted into a positive definite and symmetric 
form through multiplication by its transpose, the 
degree of ill-conditionedness is always aggravated 
as a consequence of such operation.’ The relaxa- 
tion procedure is also not suitable for machine 
computation. 

Another indirect method is the escalator process 
developed by Morris.“ It consists of expressing 
the solution as a linear combination of column 
matrices or vectors which are orthogonal. A similar 
technique has been described by Fox, Huskey, and 
Wilkinson.“”) They are applicable only to sym- 
metric matrices. The accuracy of Morris’ method 
has been examined by Neville,“?) who criticized its 
lack of provision for estimating errors. Morris’ 
procedure has also been criticized as not being well 
suited for automatic machine computation.“ 

Booth proposed the method of steepest de- 
scent for solving ill-conditioned equations. The 
theory was based on positive definite and sym- 
metric matrix form. Due to round-off errors, the 
actual path of solution may oscillate. 


A unified convergence criterion valid both fo1 
iteration by total step (Jacobi) and iteration by 
single step (Gauss-Seidel) has been given by Gei- 
ringer.““*) Group iteration was also discussed. I 
the degree of ill-conditionedness were severe, the 
criterion would most probably not be met. 

Direct methods of solution have also been ex- 
amined. Buckingham"? discussed the use of the 
Gauss elimination procedure and the difficulty ot 
using it to solve ill-conditioned systems. Nielsen “*? 
favored Crout’s modification of the Gauss elim- 
ination procedure. This modification will allow 
measurement of the degree of ill-conditionedness 
by the relative magnitude of the elements of the 
main diagonal as compared to the rest of the ele- 
ments in the derived matrix. A re-arrangement of 
equations would be made if one of the diagona 
elements is either very small or large. Bodewig''® 
also discussed the possibility of suitably re- 
arranging the rows and columns in order to reduce 
inherent errors in computation. Neither scheme i; 
suitable for machine computation. 

A method which combines iteration and elim- 
ination has also been suggested.“”) The: approxi: 
mate solution obtained from the computer i 
substituted into the original equations and_ the 
residuals calculated using double-precision compu: 
tation. Corrections for the solutions are ther 
calculated from these residuals. Forsythe“* give: 
a survey of the many methods available for thi 
solution of systems of linear, algebraic equations 
and includes a rather extensive bibliography. Thi 
paper describes a new approach to the problem. I 
is condensed and modified from a technical re 
port" to which two of the authors contributed. 
many cases, Improved results are obtained. Furthe 
work needs to be done to fully explore its merit. 


Ill. A MEASURE OF ILL-CONDITIONEDNESS 


In a two dimensional space, the ill-conditioned- 
ness of linear equations may be readily visualized 
geometrically. It is associated with the near 
“parallelism” of the straight lines represented by 
the equations. In an n-dimensional hyper-space, 
the linear equations may be interpreted as hyper- 
planes. The coefficients in the equations are pro- 
portional to the components of vectors normal to 
such planes. When the equations are ill-conditioned, 
two or more of these normals are nearly in the 
same direction. 

Consider the set of simultaneous equations: 


Dos ij VU; =f; (7 = if 2s kerr n) (3.1) 
q=1 
which in matrix form may be written: 
AX =) (3:2) 


where A is the coefficient matrix [a;;|; X and F are 
column matrices. As pointed out by Booth,“ the 
set (3.1) will be ill-conditioned if the absolute value 
of the determinant of the normalized coefficient 
matrix is very small as compared to unity. That is, 


| An lebe << i (3.3) 

Other quantitative measures of ill-conditioned- 
ness have also been discussed in literature. Von 
Neumann and Goldstine®” proposed the use of 


the so-called condition number, defined by 5 aR 
where Amax and Amin are respectively the largest 
and smallest eigenvalues of the coefficient matrix. 
The larger this number, the higher the degree of 
ill-econditionedness is. Unfortunately, such a cri- 
terion is of little practical value, since the numerical 
determination of Amax and Amin May oftentimes be 
as long a process as the evaluation of the solutions 
of the original equations. °” 

Employing the concept of the condition num- 
ber described above, Taussky“) developed a useful 
theorem concerning ill-conditioned matrices. If A 
represents a real, non-singular matrix, and A’ its 


transpose, then their product AA’ is more “‘ill- 
conditioned” than A. This has been confirmed by 
Hartree.°? It seems that this important result 
has not received the attention it deserves. 

Turing™ suggested the use of other condition 
numbers which also involve the multiplication of 
the coefficient matrix by its transpose. In view of 
Taussky’s finding, such use has been discarded in 
favor of the simple criterion suggested by Booth 
Cig 3:3); 

While the inequality (3.3) has been widely used 
to ascertain ill-conditionedness, it is by no means 
a sufficient condition under all circumstances. This 
may be demonstrated by the following examples. 

Consider the set: 


4.011 x + 4.012 x = 0.001] 


(a) 
4.012 21 + 4.014 x = 0.002| 


which has the exact solution, 7;= —1 and 72.=-+1. 
The solution as obtained from the ILLIAC,! using 
the method of elimination and with round-off at 
the 12th significant figure, has been found to be: 
21 = —1.00000001819 and xz, = +1.00000001819. 
If errors of +3 to 10 hundredths of one percent 
are injected into the coefficients and quantities on 
the right hand side, the system will read: 


4.012 2, + 4.011 x, = 0.001001 


ph ti 
4.013 2, + 4.013 22 = 0.002002] 
The corresponding ILLIAC solution is x = 
— 1.0000022347 and xz. = +1.0005011134. These 
agree with the exact solution to a high degree of 
accuracy. On the other hand, when one computes 
Ay | of (a), it is found to be equal to 0.000705; 
very small indeed as compared to unity. 

Let us now consider a different set which has 
the same coefficient matrix as that of (a). Thus, 


4.011 2 + 4.012 x = 4010] 
4.0127, + 4.014 2% = 4010] 


1The digital computer at the University of Illinois. 


(c) 
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which has the exact solution, 7; = +2000 and 
a, = —1000. The ILLIAC solution reads x = 
+2000.00003641 and 2. = —1000.00003638. Again, 
if errors of +3 to 10 hundredths of one percent 
are injected, we obtain 


4.012 x, + 4.011 x. = 4011 
4.013 2, + 4.013 vz = 4012 


(d) 


The solution as obtained from the ILLIAC now be- 
comes: 2,;= +999.5016083 and x2.= +0.0002492015. 
These in no way resemble the actual solution. 
Lastly, if errors are only introduced into the coeffi- 
cients, i.e., if we seek the solution of the set: 


4.012 2, + 4.011 2, = 4010] 
(e) 
4.013 21 + 4.013 22 = 4010| 


we obtain: 2, = +1998.5048432 and x, = 
—999.2524135. The foregoing examples clearly 


illustrate the inadequacy of describing the degree 
of ill-conditionedness by |A | alone, without any 
reference to the quantities on the right hand side, 
namely, the f,’s. 

As has been mentioned earlier, the ill-condi- 
tionedness of (a) is geometrically associated with 
the near parallelism of the lines represented by the 
equations. Changes in the quantities f,’s will exe- 
cute a parallel shift of these lines. A question then 
naturally arises: Why is the ill-conditionedness of 
the set (a) not significantly influenced by the 
changes in f,’s as demonstrated, while the set (c) is? 
The reason becomes obvious if one observes that, 
in the latter case, the two f,’s are equal, and con- 
sequently, in the process of computation, one en- 
counters the difficulty of taking small differences 
of two large and nearly identical numbers. 

A mathematically rigorous analysis of ascer- 
taining the ill-conditionedness of any set of linear 
equations needs separate research. 


IV. A NEW ITERATIVE PROCEDURE 


We shall now describe how to modify a given 
ill-conditioned matrix in order to make it a better 
conditioned one. Then we shall apply a new itera- 
tive procedure to the modified system to obtain a 
solution. It should be noted that the method does 
not require that the matrix be symmetric, positive 
definite, or both. 

Consider the ill-conditioned normalized matrix 
of real numbers 


Ay = [aij] (4.1) 
Let us add to Ay the diagonal matrix 
Vit 0 a 0 
0 EVO Mies Ts. 0 
Y= (4.2) 
OM Oe ae. ys, 


where the y;’s are, for the moment, arbitrary real 
numbers. Let us now form the modified matrix? 


Guty die oo a Chis 
Coy Goo Y¥2 2. « Gon 

Ay +T= (4.3) 
Ant An2 5. Onn Yn 


A measure of the improvement in the ‘condi- 
tion’ of a system whose matrix of coefficients is 
given by (4.3) may be expressed by the ratio of 
the absolute values of the determinants of Ay and 
Ay-+TI, namely, 

| jee 
| An + T ate 


For, clearly, if @<1, the improvement will be 
significant and the originally ill-conditioned matrix 
will become a well-conditioned one. £8 is called the 
conditioning index. 

The problem now is one of selecting the y;’s 
appropriately so that 
un eee Pied te nae eee 


proposed method was similar to the one used by Riley.C*) However, 
Riley considered only the positive definite and symmetric matrix. 


abs = A N 


Ay ++ L 


abs (4.4) 


Unfortunately, this problem is extremely in- 
volved and, instead, we solve a simpler one which 
in practice is oftentimes adequate. We take y,= 
Y2=...=Y¥n=y and show that y, under condi- 
tions stated below, can always be chosen so that 
(4.4) obtains. 

For this case of equal y,’s (4.3) becomes Ay +y/, 
where J is the unit matrix, and, as is well known, 


Ayep yl | abe = | apse iOS at ak Os ee: 
ee ee ey en =|P (y)|, (4.5) 


wherein the p;’s are constants depending only on 
the elements @;; and the vertical bars on the right 
side of the equal sign are now absolute value signs. 
Clearly, 


=| Ay 


|P(0)| =p. abs # O 

since we are considering non-singular matrices. 
Therefore, if pr0, there will be no relative 
maximum or minimum of P(y) at y=0. Hence, in 
view of the continuity of P(y), there exists a o>0, 
such that 


Aw | ae, (4.6) 


| P (y) be > 
or 


| P (y) pee acy libs (4.7) 

In practice, one computes the determinants 
| Aw+yI abs Lor y= to and selects the larger of 
the two quantities. 

If p»1=0, which corresponds to a rare matrix 
form, P(y) may have a relative maximum or mini- 
mum, or an inflection point at y=0. If a maximum 
occurs, it is not possible to improve the matrix by 
the present method. 

Let us suppose that the ill-conditioned system 
of equations 


n 


SS. Aig U7 = fi G = is 2, a 


j=1 


Sy 1) (4.8) 


is modified according to the present method. Then 
the improved system will read: 
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(au+y) Li tGie Le+...+Ain In =fityi vy 
Qo rae) Po-s. a tn eee AD) (4.9) 
Ant Li+Ane a a Or ae Cn = LA Un 


Note: In what follows the restriction of equal y,’s 
will not be imposed because the analysis is per- 
fectly general and does not require it. 

Mathematically speaking, equations (4.8) and 
(4.9) are equivalent. However, their behavior with 
respect to arithmetical operations in machine com- 
putation may be entirely different. This arises 
from the fact that the coefficient matrix in (4.9) 
is better conditioned. 

To solve the system (4.9) we shall adopt the 
following iterative procedure. In the first step we 
delete the terms y,v7;(t=1, 2,..., ) on the right 
side of equations (4.9) and solve the resulting 
system on the ILLIAC. The solution so obtained 
we designate by 

£0 
£ 


BO | (4.10) 


&, 


If the exact solution of (4.8) is X, the difference 
X —Z” will then be the error in the first iterative 
solution and, accordingly, we write 
oe) 

ey (1) 

EO =X — FO = (4.11) 


ey) 


By comparing the original set of equations (4.8) 
with the set satisfied by Z™, we obtain 


ir C1 + Aye C2 +... + din Cn = 71 &) 


oy C1? + digo p+, . -+Gon en? = Y2 £(D (4.12) 
Ant ey) + dno 2) See o + SPO €,Y) =n E, 
or in matrix notation 
AK® = T#® (4.13) 


Geometrically, we may interpret this as a 


ILL-CONDITIONED LINEAR ALGEBRAIC EQUATIONS 


translation of coordinates from X to E™. We 
observe that the coefficient matrix of (4.12) is 
the same as that of (4.8), and hence it cannot be 
solved directly on the computer. To solve (4.13) 
we simply proceed as before, and write 

(A +1) Z® = Teo (4.14) 
where again the term on the right side, namely, 
r=@ has been deleted. The system (4.14) is again 
solved by the ILLIAC. Let us denote this solu- 
tion by 


m2) = | * (4.15) 


£,° 


The difference H®—Z® is the error incurred in 
calculating = and we designate it by H®. There- 
fore, H® =H®Y —Z@ =X —ZM — If the fore- 
going procedure is repeated m times, we obtain 


(2) 
jal 


m 


y i) 


j=1 


7 AC) 23 Hom) = '(m) = BG => 


or 


m 


-“2& (7) + Em 


(4.16) 


If the method converges, then the solution of the 
original system is ~ 
X= Ro (4.17) 

The condition under which convergence takes 
place and the expression for the error term after 
m iterations will be given later. 

At this point we list the formulae for the deter- 
mination of Z’s, namely, 


PF when m = 1, 


(A+T) EM = (4.18) 


rE? when mm S16 

An important feature of this method is that the 
coefhcient matrix for the determination of succes- 
Sive iterations remains the same. This fact makes 
it possible for the complete iterative computation 
to be readily programmed on the ILLIAC. 

As has been pointed out previously, in the 
present method of calculation there is a displace- 
ment in the origin of the coordinate axes for the 
unknowns in the equations after every cycle of 
iteration. Errors will thus likely be accumulated 
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in the sum 2a as the process continues. This 
source of error may be conveniently removed by 
starting anew with (4.9) after the r-th cycle, but 
this time with the terms y.v; (¢=1, 2,..., ) at 
the right hand side of the equations not set equal 
to zero, but replaced by the terms appearing in 


the column matrix IE: =”. The iteration pro- 
ceeds in the usual mation and this artifice may 


be repeated if necessary. 


A. THE GEOMETRY OF THE NEW 
ITERATIVE PROCESS 


Before discussing the convergence criterion, it 
is interesting and instructive to describe geomet- 
rically the nature of the convergence of the new 
procedure. For ease of illustration we shall con- 
sider a system of two equations in two unknowns. 
Suppose, therefore, that the given system is 


Ay Ly + Ay X = f| (i) 
Ag, X11 + Ay 2 = fo| (ii) 
Modifying it for the first iteration, we have 
(ai a v1) &® + ais £1) = St, Gye 
21 £0) =f (do2 iy 2) £0 = fa, Ci)? 


in which y; and 72 are of the same sign as that of 
M@;, and dy». In Fig. 1, the original equations are 


PU Sp yikod 


Fig. 1 


represented by lines (i) and (ii), while the modi- 
fied set is represented by lines (i) and (ii). 
The latter intersect at a larger angle and hence 


they are better conditioned. Their point of inter- 
section (&%, &%) represents the result of the 
first iteration as well as the origin of the new 
coordinate system e and e?. The original equa- 
tions, when referred to the new coordinates, are: 


Qy1 1 + ay eo 


Vp en) 


ar 1 + dag €2°) = yo &Y), 


For the second iteration, the modified equations 
become: 


(du as v1) £,) + de £,°) =%71 &, 
dar £1 + (dog + yo) fo = yo be, 


OH 
(ii) (2) 


which are represented by lines (i)® and (i1)®. 
Their intersection is the point (&°, &@)). Provided 
the method yields a convergent sequence, then, 
by repeating the foregoing procedure a sufficient 
number of times, the successive intersections tend 
to the required solution (2, 22). 


B. THE CONVERGENCE CRITERION 
AND ERROR ESTIMATION 


On the basis of equations (4.18), we have 


=(A +1) F =C = [eal 
and (4.19) 
t(m) = (A + tee Ny =A cap m>1 
Let (A+T)“r=B=[b;;]. Then for m>1, we 
have 
= (m) —_— Bm (m—1) 
a (m—1) — Ba On?) 
(3) = Bae) 
£2 = BAY = BC 
Therefore, 
ii (m) a eyes ed m Z if 
where B°=J, and 
SRO = (TEBE BL... LBC. 


j=l 


(oe) 
Let us now assume that >> B’ converges. Then, 
j=0 


in view of the fact that X =C+ BX, we have 


= > BIC, 


7=0 


SB) 0 


and therefore 
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m co m—1 
ges ol VP Ss B') C 
7=0 j=0 


j=1 


= 5° BC Sr — By eC 


j=m 


= Bn (I — B)"C = B™, (4.20) 


Now, lim B™=0 because >) Bi< ~. 


m—Cco 7=0 


Therefore, taking the limit of both sides of (4.20) 
we see that lim Ho” =0 


m—>cO 


as NM ©, 


and 


co 
X =Y zo 


Consequently, we have the following Theorem. 


If > B' converges, then the series DE =” converges 
j=o j=1 


to the solution X. Furthermore, the error H” 
committed after m iterations is given by either 
(J=B) BrCoor Bed —BjC- 

In summary, it should be emphasized that the 
method here discussed requires the satisfaction of 
two conditions, namely, (1) the improvement of 
the condition of the matrix and (11) the conver- 
gence of s Bi, 

7= 
which will ensure the convergence of oe Bis o 


j=o 


The selection of T (or even y/) 


difficult problem. Even after this is done the com- 
putation of #™ can be formidable. Consequently, 
we propose the following simple alternative; and 
we seek an expression for an upper bound of error 
instead of calculating the error itself. In practice, 
this is what one usually needs. 

Referring to (4.19), let us denote the inverse of 
the matrix (A+T) by D. Thus, 


(ASS Dede (4.21) 


Then the r-th row of = (m>1) will be given by 


En = der 1 Ex) + dia Yo EoD 
=e cee a thine Yn Ee 
Pl De lohmer eeeeeedee 


for which the following inequality holds, 


| < [dank |+| dens | 
lr Ara yee ed 
< lym |{|da|+]de|+.. 
+ | den |} | Exc | , 


(4.22) 


where |-yar| and [Ene | are the maximum values 
of the |y;|’s and the [Em |g ¢= 1, 233) 
respectively. Since (4.22) holds for r=1, 2, 3,..., 
n, it will hold if the left hand side is replaced by 


| Ene | = mae} | Es ea! | ee ere 
Therefore, we may write 

| éy™ | < | YM i ps | dy; Ey) , 

genial | Dov sara egaie (4.23) 


’ > | dy |, 


Let du = max pa | dy; Pay S | dni \, 


and let us require that the product 


yw \dwis Ae (4.24) 
or, in the case of equal y,’s, 
ly |du = K <1. (4.25) 


K will be referred to as the convergence constant. 

These conditions can always be satisfied by 
choosing yw or y sufficiently small. To be sure, 
small values of y may result in only a slight im- 
provement of the matrix condition. Consequently, 
a compromise has to be made which depends on 
the degree of ill-conditionedness of the original 
set of equations, the round-off errors injected 
during the computation, cost of machine time, and 
the accuracy required in the final result. 

Returning to (4.22) and (4.24), we have for 
mls 


Ey™ 


(4.26) 


where it is assumed that &y°"-) 40. Therefore; 


by a well-known theorem, the series Ds éu™ is 
absolutely convergent. “7; 


From the foregoing discussion it is seen that, 
in some instances, it is possible to relax the restric- 
tion imposed by (4.24) or (4.25), i.e., to allow 
|| dar to go slightly beyond unity, yet (4.26) is 
still satisfied. Example 2 in the following section 
is selected to demonstrate this point. 

Consider now the expression for each individual 
component of the matrix X, namely, 


=P HO= SEO + Y 6M, r= 1,2,8,. 


j=m-+1 
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The error committed after m iterations is given by 


(ee) 


m 
e,™ a 2, £,@ sate S E, Le Uy 2, 3, sen y N, 
j=1 j=m+1 
and 
2 . 
ee > |, r= 1,2,3,...,0. 
j=m+1 


However, in view of the fact that || < |&.| 
and that (4.26) obtains, we have 


Lem |< Klivm|A+K4+ K2+...) 
K 


= | Ey™ | ; 


es (4.27) 


=lor2or3...orn. Thus, the absolute values 
of the errors committed after m iterations are 
bounded by | Ene™ -K/A—-K). 

In conclusion, the authors wish to emphasize 
the fact that the procedure proposed in the paper 
does not exclude the use of any refined programs 
of computer calculation, such as the double preci- 
sion routine, when the modified set (4.18) is solved. 
On the other hand, such refined computation pro- 
cedure is not a substitute for the present method, 
which hinges on an appropriate modification of the 
coefficient matrix and iteration with progressive 
displacements of the vector Z. Rather it supple- 
ments the many known methods of solving linear 
systems when the latter are ill-conditioned. 
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